Classifying with confidence from incomplete information
Abstract
We consider the problem of classifying a test sample given incomplete information. This problem arises naturally when data about a test sample is collected over time, or when costs must be incurred to compute the classification features. For example, in a distributed sensor network only a fraction of the sensors may have reported measurements at a certain time, and additional time, power, and bandwidth are needed to collect the complete data to classify. A practical goal is to assign a class label as soon as enough data is available to make a good decision. We formalize this goal through the notion of reliability, defined as the probability that a label assigned given incomplete data would be the same as the label assigned given the complete data, and we propose a method to classify incomplete data only if some reliability threshold is met. Our approach models the complete data as a random variable whose distribution depends on the current incomplete data and the (complete) training data. The method differs from standard imputation strategies in that our focus is on determining the reliability of the classification decision, rather than just the class label. In a set of experiments on time-series data sets, where the goal is to classify each time series as early as possible while still guaranteeing that the reliability threshold is met, we show that the method provides useful reliability estimates of the correctness of the imputed class labels.
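The reliability idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a nearest-centroid classifier and approximates the distribution of the complete data by borrowing the missing features from the k training samples closest to the test sample on its observed features. The names `classify_if_reliable`, `nearest_centroid_label`, `threshold`, and `k` are hypothetical and chosen only for illustration.

```python
import numpy as np

def nearest_centroid_label(x, centroids):
    """Label of the class centroid closest to the complete sample x."""
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

def classify_if_reliable(x_obs, obs_idx, X_train, centroids, threshold=0.9, k=20):
    """Return (label, reliability); label is None if reliability < threshold.

    Assumption (not from the paper): plausible completions of the test
    sample are built from the k training samples nearest to it on the
    observed features.
    """
    # Distance to every training sample, using observed features only.
    dists = np.linalg.norm(X_train[:, obs_idx] - x_obs, axis=1)
    neighbours = np.argsort(dists)[:k]

    # Candidate completions: keep the observed values, borrow the rest.
    completions = X_train[neighbours].copy()
    completions[:, obs_idx] = x_obs

    # Classify each completion as if it were the complete data.
    labels = np.array([nearest_centroid_label(c, centroids) for c in completions])
    counts = np.bincount(labels, minlength=len(centroids))
    label = int(np.argmax(counts))

    # Reliability estimate: share of completions that agree with that label.
    reliability = float(counts[label] / len(neighbours))
    return (label if reliability >= threshold else None), reliability

# Toy data: in X_informative the observed feature 0 separates the classes;
# in X_uninformative it carries no class information at all.
centroids = np.array([[-1.0, -1.0], [1.0, 1.0]])
X_informative = np.array([[-1.0, -1.0]] * 10 + [[1.0, 1.0]] * 10)
X_uninformative = np.array([[0.0, -1.0]] * 10 + [[0.0, 1.0]] * 10)

confident = classify_if_reliable(np.array([0.9]), [0], X_informative, centroids, k=5)
deferred = classify_if_reliable(np.array([0.0]), [0], X_uninformative, centroids, k=20)
```

In the first call the observed feature already determines the label, so a label is returned; in the second the completions split evenly between the classes, so the sketch withholds the label until more data arrives.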
Journal:
- Journal of Machine Learning Research
Volume: 14
Publication date: 2013